159 research outputs found
Parallel netCDF: A Scientific High-Performance I/O Interface
Dataset storage, exchange, and access play a critical role in scientific
applications. For such purposes netCDF serves as a portable and efficient file
format and programming interface, which is popular in numerous scientific
application domains. However, the original interface does not provide an
efficient mechanism for parallel data storage and access. In this work, we
present a new parallel interface for writing and reading netCDF datasets. This
interface is derived with minimum changes from the serial netCDF interface but
defines semantics for parallel access and is tailored for high performance. The
underlying parallel I/O is achieved through MPI-IO, allowing for dramatic
performance gains through the use of collective I/O optimizations. We compare
the implementation strategies with HDF5 and analyze both. Our tests indicate
programming convenience and significant I/O performance improvement with this
parallel netCDF interface.Comment: 10 pages,7 figure
Parallel Implementation of Lossy Data Compression for Temporal Data Sets
Many scientific data sets contain temporal dimensions. These are the data
storing information at the same spatial location but different time stamps.
Some of the biggest temporal datasets are produced by parallel computing
applications such as simulations of climate change and fluid dynamics. Temporal
datasets can be very large and cost a huge amount of time to transfer among
storage locations. Using data compression techniques, files can be transferred
faster and save storage space. NUMARCK is a lossy data compression algorithm
for temporal data sets that can learn emerging distributions of element-wise
change ratios along the temporal dimension and encodes them into an index table
to be concisely represented. This paper presents a parallel implementation of
NUMARCK. Evaluated with six data sets obtained from climate and astrophysics
simulations, parallel NUMARCK achieved scalable speedups of up to 8788 when
running 12800 MPI processes on a parallel computer. We also compare the
compression ratios against two lossy data compression algorithms, ISABELA and
ZFP. The results show that NUMARCK achieved higher compression ratio than
ISABELA and ZFP.Comment: 10 pages, HiPC 201
A Java Graphical User Interface for Large-Scale Scientific Computations in Distributed Systems
Large-scale scientific applications present great challenges to computational scientists in terms of obtaining high performance and in managing large datasets. These applications (most of which are simulations) may employ multiple techniques and resources in a heterogeneously distributed environment. Effective working in such an environment is crucial for modern large-scale simulations. In this paper, we present an integrated Java graphical user interface (IJ-GUI) that provides a control platform for managing complex programs and their large datasets easily. As far as performance is concerned, we present and evaluate our initial implementation of two optimization schemes: data replication and data prediction. Data replication can take advantage of \u27temporal locality\u27 by caching the remote datasets on local disks; data prediction, on the other hand, provides prefetch hints based on the datasets\u27 past activities that are kept in databases. We first introduce the data contiguity concept in such an environment that guides data prediction. The relationship between the two approaches is discussed
Environment Diversification with Multi-head Neural Network for Invariant Learning
Neural networks are often trained with empirical risk minimization; however,
it has been shown that a shift between training and testing distributions can
cause unpredictable performance degradation. On this issue, a research
direction, invariant learning, has been proposed to extract invariant features
insensitive to the distributional changes. This work proposes EDNIL, an
invariant learning framework containing a multi-head neural network to absorb
data biases. We show that this framework does not require prior knowledge about
environments or strong assumptions about the pre-trained model. We also reveal
that the proposed algorithm has theoretical connections to recent studies
discussing properties of variant and invariant features. Finally, we
demonstrate that models trained with EDNIL are empirically more robust against
distributional shifts.Comment: In Proceedings of 36th Conference on Neural Information Processing
Systems (NeurIPS 2022
Design, implementation, and evaluation of parallell pipelined STAP on parallel computers
Performance results are presented for the design and implementation of parallel pipelined space-time adaptive processing (STAP) algorithms on parallel computers. In particular, the issues involved in parallelization, our approach to parallelization, and performance results on an Intel Paragon are described. The process of developing software for such an application on parallel computers when latency and throughput are both considered together is discussed and tradeoffs considered with respect to inter and intratask communication and data redistribution are presented. The results show that not only scalable performance was achieved for individual component tasks of STAP but linear speedups were obtained for the integrated task performance, both for latency as well as throughput. Results are presented for up to 236 compute nodes (limited by the machine size available to us). Another interesting observation made from the implementation results is that performance improvement due to the assignment of additional processors to one task can improve the performance of other tasks without any increase in the number of processors assigned to them. Normally, this cannot be predicted by theoretical analysis
- …